Learning to Rerank Top-K Schema Matches

Authors

  • Avigdor Gal
  • Haggai Roitman
  • Roee Shraga
Abstract

We propose a learning algorithm that uses an innovative set of features to rerank a list of top-K schema matches and improve upon the ranking of the best match. We provide a bound on the size of the initial match list, tying the number of matches to a desired level of confidence in finding the best match. We also propose using matching predictors as features in a learning task, and tailor nine new matching predictors for this purpose. A large-scale empirical evaluation on a real-world benchmark shows the effectiveness of the proposed algorithmic solution.
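
The reranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple linear scorer over per-match feature values, and the names `rerank`, `features`, and `weights`, along with every number below, are hypothetical. In the paper's setting the feature values would come from matching predictors and the weights from a trained model.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a match is a tuple of attribute correspondences.
Match = Tuple[Tuple[str, str], ...]


def rerank(candidates: List[Match],
           features: Dict[Match, List[float]],
           weights: List[float]) -> List[Match]:
    """Order candidate matches by a linear score over their feature values."""
    def score(match: Match) -> float:
        return sum(w * f for w, f in zip(weights, features[match]))
    # sorted() is stable, so tied candidates keep the original matcher order.
    return sorted(candidates, key=score, reverse=True)


# Illustrative top-3 list as returned by some matcher (made-up data).
m1: Match = (("name", "fullName"),)
m2: Match = (("name", "fullName"), ("zip", "postalCode"))
m3: Match = (("name", "zip"),)

# Made-up feature values standing in for predictor outputs.
features = {m1: [0.6, 0.2], m2: [0.5, 0.9], m3: [0.1, 0.1]}
weights = [0.7, 0.3]  # assumed to have been learned elsewhere

reranked = rerank([m1, m2, m3], features, weights)
```

With these made-up values, the second candidate moves to the top of the list because its weighted score is the highest.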

Similar resources

Learning to Match Schemata using Predictors

We propose a learning algorithm that utilizes an innovative set of features to re-rank a list of top-K matches and improves upon the ranking of the best match. We provide a bound on the size of an initial match list, tying the number of matches in a desired level of confidence for finding the best match. We also propose the use of schema matching predictors as features in the learning task, and...

Actively Soliciting Feedback for Query Answers in Keyword Search-Based Data Integration

The problem of scaling up data integration, such that new sources can be quickly utilized as they are discovered, remains elusive: global schemas for integrated data are difficult to develop and expand, and schema and record matching techniques are limited by the fact that data and metadata are often under-specified and must be disambiguated by data experts. One promising approach is to avoid u...

Ensemble-based Top-k Recommender System Considering Incomplete Data

Recommender systems have been widely used in e-commerce applications. They are a subclass of information filtering systems, used either to predict whether a user will prefer an item (prediction problem) or to identify a set of k items that will be of interest to the user (Top-k recommendation problem). Demanding sufficient ratings to make robust predictions and suggesting qualified recommendations are two si...

Point-Wise Approach for Yandex Personalized Web Search Challenge

The paper describes a solution for the Yandex Personalized Web Search Challenge. The goal of the challenge is to rerank top ten web search query results to bring most personally relevant results on the top, thereby improving the search quality. The paper focuses on feature engineering for learning to rank in web search, including a novel pair-wise feature, short- and long-term personal navigation...

TOP: A Compiler-Based Framework for Optimizing Machine Learning Algorithms through Generalized Triangle Inequality

This paper describes our recent research progress on generalizing triangle inequality (TI) to optimize Machine Learning algorithms that involve either vector dot products (e.g., Neural Networks) or distance calculations (e.g., KNN, KMeans). The progress includes a new form of TI named Angular Triangular Inequality, abstractions to enable unified treatment to various ML algorithms, and TOP, a co...

Publication date: 2018